Abstract. The Spaceport Processing Systems Branch at NASA Kennedy Space
Center has designed, developed, and deployed a rule-based agent to monitor the 
Space Shuttle's ground processing telemetry stream. The NASA Engineering 
Shuttle Telemetry Agent increases situational awareness for system and hardware 
engineers during ground processing of the Shuttle’s subsystems. The agent 
provides autonomous monitoring of the telemetry stream and automatically alerts 
system engineers when user defined conditions are satisfied. Efficiency and safety 
are improved through increased automation. 


Sandia National Labs’ Java Expert System Shell is employed as the agent’s rule 
engine. The shell's predicate logic lends itself well to capturing the heuristics and 
specifying the engineering rules within this domain. The declarative paradigm of 
the rule-based agent yields a highly modular and scalable design spanning multiple 
subsystems of the Shuttle. Several hundred monitoring rules have been written 
thus far with corresponding notifications sent to Shuttle engineers. This chapter
discusses the rule-based telemetry agent used for Space Shuttle ground processing. 
We present the problem domain along with design and development 
considerations such as information modeling, knowledge capture, and the 
deployment of the product. We also present ongoing work with other condition 
monitoring agents. 


Keywords. Agent, monitoring, rule-based expert system 


Introduction 


1. Background 

NASA Kennedy Space Center (KSC) is responsible for pre-launch ground checkout of 
the Space Shuttle. The Launch Processing System (LPS) at KSC provides facilities for 
NASA Shuttle system engineers, contractors, and test conductors to command, control, 

1 Correspondence to: Glenn S. Semmel, NASA, YA-D8, Kennedy Space Center, FL 32899. Tel.: +1 
321 861 2267; E-mail: Glenn.S.Semmel@nasa.gov. 
Figure 1. Ground Control and Monitoring at NASA KSC

and monitor space vehicle systems from the start of Shuttle interface testing through 
various phases including terminal countdown, launch, abort, safing, and scrub 
turnaround. 

LPS continually monitors the Shuttle and its ground equipment including 
environmental controls and hardware that loads propellants. Consoles with vehicle 
responsibilities communicate information directly to and from the Shuttle computer 
systems. Consoles with ground support equipment responsibility communicate 
information to and from the hardware interface modules which are connected to the 
numerous ground support systems. See Figure 1. Each module is capable of 
interfacing to approximately 240 sensors or controls. Overall, some 50,000 
temperatures, pressures, flow rates, liquid levels, turbine speeds, voltages, currents, 
valve positions, switch positions, and many other parameters must be controlled and 
monitored. 

Using LPS, NASA Shuttle engineers and contractors at KSC are responsible for 
certifying that ground checkout of the Space Shuttle has been performed according to 
program specifications. The Operations and Maintenance Requirements and 
Specifications Document[2] lists those procedures. For over 25 years, engineers have 
used LPS to verify Space Shuttle flight readiness and to control launch countdown. 
LPS has performed superbly well. Recently, much of the LPS hardware was upgraded 
assuring its continuance for many more years. However, the system architecture was 
not changed and software remains basically the same. As a result, the level of 
situational awareness has not increased proportionally to what would otherwise be 
possible with more modern software technologies.

After the Shuttle Columbia disaster on February 1, 2003, the Columbia Accident 
Investigation Board[3] proposed recommendations to improve safety from both an 
organizational and technical perspective. The Board indicated the need to “[adopt] and 
maintain a Shuttle flight schedule that is consistent with available resources.” Also, 
both management and engineering support staff must maintain an awareness of 
anomalies and those must not be lost “as engineering risk analyses [move] through the 
process.” Given two tragic losses of a crew and Shuttle, NASA engineers today face
even greater pressure to be vigilant in identifying problems. At KSC, ground
processing of the Shuttle is performed by thousands of employees, both contractors and 
civil servants. Anomalies must be detected and reported to prevent problems with 
Shuttle subsystems, countdown, and launch. The aging LPS hardware has limited
resources and precludes the level of automation and notification warranted by this
domain. 

Contractors at KSC are responsible for the day to day operations, checkout, and 
maintenance of the Shuttle. They are the primary users of LPS. NASA Shuttle 
engineers are civil service employees who oversee the contractors. Given the 
limitations and resource scarcity of LPS, NASA Shuttle engineers needed a tool to 
provide more insight and situational awareness and oversee the work performed by 
contractors. An increased insight could help detect anomalies that might otherwise go 
unnoticed, whether by process error, software or hardware failures in the monitoring 
equipment, or many other possible causes. A tool was needed to complement LPS that 
could autonomously and continuously monitor Shuttle telemetry data and automatically 
alert NASA Shuttle engineers when predefined criteria have been met. In the latter half 
of 2003, a software tool was proposed to provide better insight into Shuttle ground 
processing and increase the level of situational awareness. This tool is known as the 
NASA Engineering Shuttle Telemetry Agent (NESTA). 

1.1. Objectives 

Data processed by LPS is distributed on a local area network. As shown in Figure 1, 
the distributed data is known as the Shuttle Data Stream (SDS)[4] and contains real- 
time vehicle ground processing data. It is used by monitor-only applications. The 
primary objective of NESTA is to provide full time autonomous monitoring of the SDS 
and to automatically alert NASA engineers in near real-time when predefined criteria
have been met. Types of monitoring criteria include expected operational events or 
milestones (e.g. vehicle power up, start of launch countdown test, etc.) as well as 
unexpected events or failures (e.g. large difference between redundant sensor values). 
NESTA allows Shuttle engineers to work on other tasks while minimizing the risk of 
losing awareness of real-time Shuttle processing data and events. 

NESTA acts as a software agent for the NASA engineer. For this discussion, an 
agent is defined as rule-based, autonomous software that reacts to its environment and 
communicates results to a human, a NASA engineer in this usage. Agents have been 
extensively researched[5][6]. Agent standards[7] and frameworks[8][9] have also been 
developed. 

The primary objectives for NESTA include: 

• Allow a NASA engineer to specify rules to be applied to measurements 
published in the SDS. 

• Generate near real-time notifications and alerts in the form of emails or 
wireless pages. Notifications may include a text message and measurement 
values, and may be sent to multiple users when the rule's premises are satisfied. 

• Monitor up to four separate SDS sources. This includes four control rooms 
used for checkout and launch of the Shuttle and its components. 

• Process multiple types and subtypes of measurements including discretes (i.e. 
boolean measurements), analogs (i.e. floating point measurements), and digital 
patterns (i.e. integer measurements). 

• Allow users to create and modify multiple monitoring requests without 
restarting NESTA. 



1.2. Why an AI Solution 

NESTA leverages various AI technologies within a rule-based paradigm including 
forward chaining, fast pattern matching, declarative programming, predicate logic, and 
more. AI was a natural fit for monitoring the SDS since pattern recognition and 
analysis are the primary needs. Although pattern identification could be achieved by 
employing regular expression libraries within various procedural and object oriented 
languages, those paradigms are not specifically intended for this type of application and 
have less efficient matching algorithms. The pattern matching algorithms of rule- 
based expert system shells are highly specialized and tuned. Also, AI, particularly rule- 
based languages, lends itself better to this domain since pattern recognition wrapped 
within a premise-action construct closely mirrors the level of abstraction at which the 
domain experts work. 

The type of data signatures sought by Shuttle engineers requires the derivation of 
rules that are of the same granularity as those typically used in rule-based languages. 
Fortunately, Shuttle engineers were already accustomed to representing knowledge at a 
fine grained level. The engineers are adept at either constructing the rules themselves 
or expressing the knowledge in pseudo code that lends itself well to translation
directly into declarative rules. Many of the rules are either standalone or work in 
conjunction with several other rules. This suggests a highly modular system with a rule 
being a suitably sized working block. 

1.3. Other Attempted Solutions

NESTA is a peripheral advisory tool to the real time control system within LPS. There 
were three previous projects that attempted to upgrade LPS in the last 15 years. Even 
though those efforts had significantly greater objectives that spanned well beyond just 
advisory applications, they were advertised to include many of the capabilities that 
NESTA provides and much more. Approximately half a billion dollars was spent on 
those efforts and upwards of 600 people worked on the most recent of those upgrade 
attempts. There were various technical and political hurdles that initially impeded and 
then ultimately doomed those full scale replacements of LPS. 

NESTA's infusion of state-of-the-art AI technologies and engineering within the 
legacy launch system, LPS, is particularly notable given the number and size of the 
preceding attempts to modernize the ground control system at KSC. Those fallen 
projects, despite having much grander objectives, had little to no spin-offs within the 
LPS community. In contrast, NESTA is becoming accepted and internalized by 
members of the launch team and appears to be on its way as a widely used tool. From 
a business vantage point, NESTA's greatest asset is its development and marketing as a 
value added product. That is helping pave its path to acceptance. 


2. Application Description 

2.1. System Components and How They Interact

Figure 2 shows the context diagram for NESTA. The agent process is represented in
the middle circle. It communicates with various sources and data stores. A
measurement database is used to decode the SDS into usable measurements. The SDS
source broadcasts measurements as data packets over local area networks. NESTA
monitors this stream for data patterns specified by the Shuttle engineers. If a pattern is
matched, a notification is sent as an email or wireless page. The Rules data store
represents the Jess scripts and knowledge base that define the rules for the monitoring
criteria. All messages and relevant agent activities are also locally logged.

Figure 2. NESTA Context Diagram

2.2. Languages and AI Tools Used in Application

The Java Expert System Shell (Jess)[10] was selected as the rule engine. Jess was 
developed and supported by another government agency, Sandia National Labs. As 
such, our development team and customer have full usage of the tool via government 
licensing without any fees. This includes access to all the Jess source code. 

Jess’ forward chaining reasoning system was modeled after production systems 
such as CLIPS[11] and OPS5[12]. It contains highly efficient and sophisticated
pattern matching based on the Rete algorithm[13]. This enables its inference engine to 
process many rules and data rapidly. The engine repeatedly processes through a match- 
select-act cycle. As a production system, its consequents can be actions. A conflict 
resolution strategy determines the precedence of rule firings. 

Several hundred monitoring rules have been written thus far for monitoring Shuttle 
ground telemetry. Jess' predicate logic lends itself to capturing and specifying the 
heuristics and engineering rules of this spaceport domain. The declarative paradigm of 
this rule-based agent also makes it highly modular and scalable to span multiple 
subsystems of the Shuttle. Jess also includes a fourth generation scripting language and
interactive command line, which are well suited to prototyping and testing.

Jess is written entirely in Java and has access to the full Java application 
programming interface from the scripting language. It provides standard control flow 
constructs and supports variables, strings, objects, and function calls. Jess 
automatically converts between its own types and Java types insulating the developer 
from manually performing the conversions. Its use as a Java library made Jess' 
selection more appealing since Java supports multiple platforms with its “write once 
run anywhere” paradigm. Beyond that, the need for NESTA to support web enabled 
clients also made Java a natural fit given its origins and strong support for developing 
Internet based applications. 







Figure 3. Sequence Diagram Illustrating Update to Jess Working Memory from Shuttle Data Stream 


2.3. Design 

Java classes were developed to parse and decode the data stream and represent 
measurements as facts in Jess' working memory. To interface Jess' rule engine with the 
SDS, each data measurement is modeled and implemented as a Java bean[14]. Java 
beans provide a component architecture to enable easier integration of applications. A 
property change notification mechanism is supported that allows one object to become 
a registered listener of another object. The listener object will then automatically 
receive changes from the source object. This is also known as a publish-subscribe or 
observer pattern[15]. Within Jess, each Java bean corresponds to what is known as a
shadow fact. A Jess shadow fact is a mirror image of a Java bean, such as a pressure 
measurement, within Jess' working memory. All shadow facts are registered listeners 
of their Java bean counterparts. Thus, whenever a measurement changes in the data 
stream, a property change event is automatically generated for the given measurement 
and its sibling shadow fact is updated in Jess' working memory. Figure 3 illustrates 
this path. 
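
The listener mechanism described above can be illustrated with a minimal sketch built only on the standard `java.beans` classes. It is not the NESTA source: the `AnalogFd` class here is a hypothetical bean whose name and `fdName`/`value` properties are borrowed from the rule template shown later in this chapter, and the lambda listener merely stands in for the shadow fact that Jess would keep synchronized.

```java
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;
import java.util.ArrayList;
import java.util.List;

class ShadowFactDemo {

    /** Hypothetical bean for one analog measurement decoded from the stream. */
    static final class AnalogFd {
        private final PropertyChangeSupport changes = new PropertyChangeSupport(this);
        private final String fdName;
        private double value;

        AnalogFd(String fdName, double initialValue) {
            this.fdName = fdName;
            this.value = initialValue;
        }

        public String getFdName() { return fdName; }

        public double getValue() { return value; }

        /** Called by a (hypothetical) stream reader for each decoded change. */
        public void setValue(double newValue) {
            double oldValue = this.value;
            this.value = newValue;
            // PropertyChangeSupport suppresses the event when old and new values are equal.
            changes.firePropertyChange("value", Double.valueOf(oldValue), Double.valueOf(newValue));
        }

        public void addPropertyChangeListener(PropertyChangeListener listener) {
            changes.addPropertyChangeListener(listener);
        }

        public void removePropertyChangeListener(PropertyChangeListener listener) {
            changes.removePropertyChangeListener(listener);
        }
    }

    /** Stand-in for a shadow fact: records every update it is notified about. */
    static List<String> observe(AnalogFd bean, double... updates) {
        List<String> seen = new ArrayList<>();
        bean.addPropertyChangeListener(event ->
                seen.add(bean.getFdName() + ": " + event.getOldValue() + " -> " + event.getNewValue()));
        for (double update : updates) {
            bean.setValue(update);
        }
        return seen;
    }

    public static void main(String[] args) {
        AnalogFd busA = new AnalogFd("V76V0100A1", 0.0);
        // The repeated 28.5 produces no event, so only two updates are recorded.
        observe(busA, 28.5, 28.5, 29.6).forEach(System.out::println);
    }
}
```

The point of the pattern is that the stream reader only calls the setter; it needs no knowledge of the rule engine, which learns of changes purely as a registered listener.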

After a shadow fact is updated, the Jess pattern matcher will determine if the 
premises of any rules match the new or modified facts. Rules are compared to working 
memory to identify premises that are matched by the data in working memory. For 
NESTA, this data represents measurements from the SDS and rules represent data 
monitoring criteria submitted by NASA Shuttle and system engineers. Rules with 
matching premises are activated and placed onto an agenda. Next, the agenda is
ordered according to Jess' default conflict resolution strategy. The highest priority rule 
is then fired and executed. This match-select-act cycle repeats until no more rules are 
available to fire. An action handler class was developed and is used to build and send 
the notification message to the Shuttle engineer whenever a rule fires. 
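
For readers unfamiliar with production systems, the match-select-act cycle can be sketched in a few lines of plain Java. This is a deliberately naive illustration under stated assumptions: it re-tests every rule on every cycle rather than using the Rete algorithm, it orders the agenda by a simple integer priority rather than by Jess' actual conflict resolution strategy, and it lets each rule fire at most once. The `Rule` class, the rule names, and the working-memory map are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Consumer;
import java.util.function.Predicate;

class MatchSelectActDemo {

    /** A rule: a premise over working memory, an action, and a priority. */
    static final class Rule {
        final String name;
        final int priority;
        final Predicate<Map<String, Double>> premise;
        final Consumer<Map<String, Double>> action;

        Rule(String name, int priority,
             Predicate<Map<String, Double>> premise,
             Consumer<Map<String, Double>> action) {
            this.name = name;
            this.priority = priority;
            this.premise = premise;
            this.action = action;
        }
    }

    /** Repeats match, select, act until no unfired rule matches; returns the firing order. */
    static List<String> run(List<Rule> rules, Map<String, Double> memory) {
        List<String> fired = new ArrayList<>();
        Set<String> alreadyFired = new HashSet<>();
        while (true) {
            // Match: build the agenda from rules whose premises hold right now.
            List<Rule> agenda = new ArrayList<>();
            for (Rule rule : rules) {
                if (!alreadyFired.contains(rule.name) && rule.premise.test(memory)) {
                    agenda.add(rule);
                }
            }
            if (agenda.isEmpty()) {
                return fired;
            }
            // Select: highest priority first.
            agenda.sort(Comparator.comparingInt((Rule r) -> r.priority).reversed());
            Rule next = agenda.get(0);
            // Act: the action may change working memory and enable other rules.
            next.action.accept(memory);
            alreadyFired.add(next.name);
            fired.add(next.name);
        }
    }

    /** Three hypothetical rules; "notify" only matches after "vehicle-powered" has acted. */
    static List<Rule> sampleRules() {
        List<Rule> rules = new ArrayList<>();
        rules.add(new Rule("heartbeat", -5, m -> true, m -> { }));
        rules.add(new Rule("notify", 0, m -> m.containsKey("vehicle-powered"), m -> { }));
        rules.add(new Rule("vehicle-powered", 10,
                m -> m.getOrDefault("V76V0100A1", 0.0) > 26.0,
                m -> m.put("vehicle-powered", 1.0)));
        return rules;
    }

    public static void main(String[] args) {
        Map<String, Double> memory = new HashMap<>();
        memory.put("V76V0100A1", 29.6);
        System.out.println(run(sampleRules(), memory));
    }
}
```

A real engine avoids the repeated full scan by remembering partial matches between cycles, which is what makes the Rete approach suitable for the data rates discussed later.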

2.4. Knowledge Capture and Representation

Figure 4 shows the knowledge acquisition workflow for creating or modifying a rule to 
monitor specific measurements on the Shuttle data stream. The Shuttle engineer must 
specify who is responsible for the rule, the contents of the email notifications, the rule's 
firing conditions (i.e. antecedent, left hand side), and rearming conditions. That is, 
some rules may need to have a “one shot” behavior and only fire once when activated 
the first time. Other rules may need to be re-armed after a given time period or when
certain types of conditions are met.

Figure 4. NESTA Knowledge Acquisition Workflow

The current version of NESTA does not have a graphical user interface capturing 
this workflow, but all of the steps are effectively provided within script files. Those 
files are editable with a plain text editor by the end users. Hundreds of rules have been 
produced by the customer. 

As the rule database grew, patterns of rules began to emerge. Patterns in software 
design and modeling have been extensively investigated and reported[15]. Analogous
to those design patterns, the development team and customer began recognizing 
knowledge patterns for this domain and developed rules following these structures. 
Some patterns include: 

• One shot: Rule fires once regardless of how many times facts cause the 
premise to reactivate. 

• Recurring: Rule fires each time the premise reactivates. 

• Timed: Rule fires every X minutes while the premise remains true.



• Queued: Multiple rules will fire but notifications are sent to a queue that gets 
flushed based on a user configurable amount of time or maximum number of 
firings. One composite notification is sent when the queue is flushed. That 
composite notification contains what would have otherwise been multiple 
emails or wireless pages. 
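
The four patterns listed above can be made concrete with a small sketch. It is an interpretation, not the NESTA implementation: `FiringGate` and `NotificationQueue` are hypothetical names, time is passed in explicitly as whole minutes, and the queue checks its age limit only when a new message arrives (a deployed version would more likely use a timer).

```java
import java.util.ArrayList;
import java.util.List;

class RulePatternsDemo {

    enum Mode { ONE_SHOT, RECURRING, TIMED }

    /** Decides whether a rule should notify, given successive premise evaluations. */
    static final class FiringGate {
        private final Mode mode;
        private final long periodMinutes;   // used only by TIMED
        private boolean wasTrue = false;
        private boolean everFired = false;
        private long lastFiredAt = 0;

        FiringGate(Mode mode, long periodMinutes) {
            this.mode = mode;
            this.periodMinutes = periodMinutes;
        }

        boolean shouldNotify(boolean premiseTrue, long nowMinutes) {
            boolean activated = premiseTrue && !wasTrue;   // false -> true transition
            wasTrue = premiseTrue;
            boolean fire;
            if (mode == Mode.ONE_SHOT) {
                fire = activated && !everFired;
            } else if (mode == Mode.RECURRING) {
                fire = activated;
            } else {
                // TIMED: fire on activation, then again once the period has elapsed.
                fire = premiseTrue && (activated || nowMinutes - lastFiredAt >= periodMinutes);
            }
            if (fire) {
                everFired = true;
                lastFiredAt = nowMinutes;
            }
            return fire;
        }
    }

    /** Queued pattern: buffers messages and emits one composite when full or old enough. */
    static final class NotificationQueue {
        private final int maxFirings;
        private final long maxAgeMinutes;
        private final List<String> pending = new ArrayList<>();
        private long oldestAt = 0;

        NotificationQueue(int maxFirings, long maxAgeMinutes) {
            this.maxFirings = maxFirings;
            this.maxAgeMinutes = maxAgeMinutes;
        }

        /** Returns the composite notification when a flush occurs, otherwise null. */
        String offer(String message, long nowMinutes) {
            if (pending.isEmpty()) {
                oldestAt = nowMinutes;
            }
            pending.add(message);
            if (pending.size() >= maxFirings || nowMinutes - oldestAt >= maxAgeMinutes) {
                String composite = String.join("\n", pending);
                pending.clear();
                return composite;
            }
            return null;
        }
    }

    public static void main(String[] args) {
        FiringGate hourly = new FiringGate(Mode.TIMED, 60);
        for (long t = 0; t <= 120; t += 30) {
            System.out.println("t=" + t + " notify=" + hourly.shouldNotify(true, t));
        }
    }
}
```

Separating the firing policy from the premise keeps each rule's conditions unchanged when an engineer switches, for example, from a one shot notification to an hourly one.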

Some sample rules in English prose include: 

• Notify Shuttle Engineer when measurement V79S4126E1 or V79S4132E1 or 
V79S4138E1 or V79S4143E1 equal ON. Indicates that Flight Control Power 
(ASA 1-4) has been activated. 

• Notify Shuttle Engineer when measurement V90Q8001C1 equals 801. 
Indicates that a Shuttle is in orbit and is preparing to initiate the on-orbit flight 
control checkout activity. 

• Notify Shuttle Engineer every 60 minutes with current values of Flight Control 
launch countdown measurement list when measurement NMAJORTEST equals 
7. Indicates launch countdown test is occurring. While in launch countdown 
test, send a current value email containing a list of Flight Control 
measurements every hour. 

• Notify Shuttle Engineer when FD N79IV019D Bit masked 0x0001 equals 1. 
Indicates that an LPS command and control program has stopped due to a 
failure and is waiting on the operator for action. 
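
The "bit masked" condition in the last sample rule amounts to a bitwise AND followed by a comparison. A minimal sketch, assuming the measurement arrives as an integer digital pattern (the helper name `maskedEquals` is hypothetical):

```java
class BitMaskDemo {

    /** True when the bits of rawValue selected by mask equal expected. */
    static boolean maskedEquals(int rawValue, int mask, int expected) {
        return (rawValue & mask) == expected;
    }

    public static void main(String[] args) {
        // Only the lowest bit is examined; the other bits of the pattern are ignored.
        System.out.println(maskedEquals(0x8001, 0x0001, 1));
        System.out.println(maskedEquals(0x8000, 0x0001, 1));
    }
}
```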

This is an actual NESTA rule written in the Jess scripting language: 


(defrule vehicle-pwr-on-rule
  "Orbiter electrical power is up."
  (recipient-list (recipient-list-name vehicle-pwr-on-rule))
  ?notPowered <- (vehicle-not-powered)
  (DigitalPatternFd (fdName "NORBTAILNO"))
  (AnalogFd (fdName "V76V0100A1") (valid TRUE) (value ?val1))
  (AnalogFd (fdName "V76V0200A1") (valid TRUE) (value ?val2))
  (AnalogFd (fdName "V76V0300A1") (valid TRUE) (value ?val3))
  (test
    (and
      (> ?val1 26.0)
      (> ?val2 26.0)
      (> ?val3 26.0)))
  =>
  (retract ?notPowered)
  (assert (vehicle-powered))
  (notifyActionHandler nil nil))

To:

Cc:

Subject: [NESTA] FR1, SB121H, 02Aug2005, 03:14:15 local


Sent: Tue 8/2/2005 3:14 AM


Orbiter electrical power is up.

214:0713/23.571 : V76V0100A1 {MAIN BUS A VOLTAGE} is 29.599995 V. 

214:0713/19.411 : V76V0200A1 {MAIN BUS B VOLTAGE} is 29.599995 V. 

214:0713/23.651 : V76V0300A1 {MAIN BUS C VOLTAGE} is 28.639994 V. 

208:1735/31.120 : NORBTAILNO {ORBITER TAIL NUMBER} is 104 (DEC) was 0. 

NASA Engineering Shuttle Telemetry Agent (NESTA) v0.6 
supporting FR1, SB121H started 27Jul2005, 13:34:08 local. 

This is an uncertified advisory application and is not to be 
used as the only means of data verification. 


Figure 5. Email Generated by NESTA 

For this rule, if all three analog bus voltage measurements, V76V0100A1, 
V76V0200A1, and V76V0300A1, concurrently exceed 26 volts, the Shuttle Orbiter is 
considered to be powered on. Another indicator, SOIADATAV, is used to assure the 
validity of the incoming data. Finally, another measurement, NORBTAILNO, is 
located on the rule's left hand side. In our terminology, we call this an informational
measurement as its specific value has no bearing on whether the rule fires, but it is
necessary to include it on the rule's left hand side so that it becomes part of Jess'
activation object and then its value is included in the notification. The action handler 
parses the fields in the activation object and builds an email with all of the 
measurements' values that were listed on the left hand side of the rule. The 
notifyActionHandler call has two arguments that allow for the notification to be
queued. This particular example does not use queuing and simply passes nil 
arguments in the call. Queuing is discussed later in the chapter. 

Figure 5 shows an email that was generated for the preceding rule. As illustrated, 
the exact values of all three bus voltages are listed along with the informational 
measurement showing which of the three Orbiters was powered up. In this case, 103 
refers to Discovery. The informational measurement proves useful not only because it
allows the Orbiter reference to be included in the email, but also because it does not bind
the rule to a particular Orbiter. That is, NASA Shuttle engineers are interested in any Orbiter
that may become powered up. The rule's pattern matching provides that level of 
genericity in a very straight forward representation. Of course, the engineer may be 
interested in being notified only about a specific Orbiter. This would require a simple 
modification to the rule. One additional slot would be referenced in the 
DigitalPatternFd template narrowing the focus to a particular Orbiter. Thus, 
minor modifications to the rule demonstrate the rich behavior available to the Shuttle 
engineer and show the semantic power of pattern matching. 


2.5. Hardware and Software Environment 

The NESTA application resides on a Dell 1.7 GHz Pentium server. The server includes 
the necessary user and support files such as the facts scripts, rules scripts, measurement 
database, logs, and more. Currently, the server executes on a Microsoft Windows 2000 
operating system. However, since Java was used exclusively along with its virtual 
machine, the ability to execute software on other types of servers is readily available. 
Again, this was a primary driver in the selection of Java and Jess so as to not be bound 
to a particular hardware platform or operating system. Customers receive notification 
on standard email clients including Windows workstations, wireless pagers, personal 
digital assistants, cell phones, and more. 

2.6. Performance Requirements and Testing

2.6.1. Performance Characteristics of Shuttle Data Stream 


At application startup, NESTA connects to a datastream selected by the user. The 
datastream includes all measurements at their respective change rates. No data changes 
will be missing from this stream. For this discussion, only the FIFO stream will be 
presented as it is the stream of choice for the NESTA customer. 

The datastream averages 5 to 10 packets per second and peaks around 50 packets 
per second at launch. Each SDS data packet can hold up to 360 measurement changes 
before rolling over to another packet. This calculates to an average of 1,800 changes 
per second for the FIFO stream nominally, and 18,000 changes per second peak at 
launch. During peak data loads, the SDS is throttled at the source and does not 
maintain true real time updates. It may lag up to 1 minute or so, but all measurement 
changes are buffered and none is ever dropped from the data stream. Throttling of the 
data typically begins at T+1 second, that is, just after launch. Even though it is the
hypothetical peak limit, 18,000 changes per second is the performance load that
NESTA is expected to meet to avoid missing a measurement change. This refers
strictly to updating 18,000 facts per second and does not indicate how many rules might
fire. In fact, only a small percentage of those facts is expected to cause a small
percentage of the total rules to fire at any given time, even during the peak launch data
rates.
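
The load figures quoted in this section follow directly from the packet rate and the packet capacity. The sketch below reproduces that arithmetic. The per-change time budget is our own derived illustration and assumes, purely for the sake of the example, that fact updates are handled one at a time on a single thread.

```java
class SdsLoadDemo {

    /** Maximum number of measurement changes one SDS packet can hold. */
    static final int MAX_CHANGES_PER_PACKET = 360;

    /** Upper bound on measurement changes per second at a given packet rate. */
    static int maxChangesPerSecond(int packetsPerSecond) {
        return packetsPerSecond * MAX_CHANGES_PER_PACKET;
    }

    /** Time available per fact update, in microseconds, at a given change rate. */
    static double microsPerChange(int changesPerSecond) {
        return 1_000_000.0 / changesPerSecond;
    }

    public static void main(String[] args) {
        System.out.println("5 packets/s  -> " + maxChangesPerSecond(5) + " changes/s");
        System.out.println("10 packets/s -> " + maxChangesPerSecond(10) + " changes/s");
        System.out.println("50 packets/s -> " + maxChangesPerSecond(50) + " changes/s");
        System.out.printf("Budget at peak: %.1f microseconds per change%n",
                microsPerChange(maxChangesPerSecond(50)));
    }
}
```

Note that the 1,800 figure corresponds to the low end of the nominal range (5 full packets per second); 10 full packets per second would give 3,600.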

The measurement data in the stream is refreshed every three minutes regardless of
whether it has changed. Since the stream is based on User Datagram Protocol
(UDP), packet delivery is not guaranteed. When a packet is dropped
on the network, all measurements are marked invalid and the measurements change 
back to valid one by one as refresh data is received until the completion of a three 
minute refresh cycle. 
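
The invalidate-then-refresh behavior can be sketched as follows. This is an assumption-laden illustration rather than the SDS reader itself: in particular, detecting a dropped packet by a gap in a packet sequence number is our assumption (the text does not say how drops are detected), and the class and method names are hypothetical. The per-measurement flag corresponds to the `valid` slot tested in the rule shown earlier.

```java
import java.util.HashMap;
import java.util.Map;

class RefreshValidityDemo {

    private final Map<String, Boolean> valid = new HashMap<>();
    private long expectedSeq = -1;   // -1 until the first packet has been seen

    /** Registers a measurement; it stays invalid until data for it arrives. */
    void register(String fdName) {
        valid.put(fdName, false);
    }

    /** Processes one packet: detects a gap, then revalidates the measurements it carries. */
    void onPacket(long seq, String... fdNames) {
        if (expectedSeq >= 0 && seq != expectedSeq) {
            // A packet was lost, so no current value can be trusted until refreshed.
            valid.replaceAll((name, wasValid) -> false);
        }
        expectedSeq = seq + 1;
        for (String fdName : fdNames) {
            valid.put(fdName, true);
        }
    }

    boolean isValid(String fdName) {
        return valid.getOrDefault(fdName, false);
    }

    public static void main(String[] args) {
        RefreshValidityDemo reader = new RefreshValidityDemo();
        reader.register("V76V0100A1");
        reader.register("V76V0200A1");
        reader.onPacket(0, "V76V0100A1", "V76V0200A1");
        reader.onPacket(2, "V76V0100A1");   // packet 1 never arrived
        System.out.println("V76V0100A1 valid: " + reader.isValid("V76V0100A1"));
        System.out.println("V76V0200A1 valid: " + reader.isValid("V76V0200A1"));
    }
}
```

Because rules test the validity flag in their premises, a dropped packet temporarily suppresses firings on stale data rather than producing misleading notifications.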

2.6.2. Performance Testing 

Performance testing occurred on an Intel Pentium 4, 1.7 GHz desktop workstation with
768 MB of RAM running Microsoft Windows XP Professional. The SDS reader class 
in NESTA parses the data stream and updates facts in Jess' working memory. To test 
the reader class, 12 high speed analog measurements were selected and instantiated as 
shadow facts. In the range of 18,000 (nominal) to 36,000 (peak at launch) data changes 
occurred every second in the test-enhanced data stream and were processed by the SDS
reader class. These consisted of various types of measurements such as discretes and
analogs. 12,000 analog data changes per second were being processed into current
values and updated in Jess' working memory by a property change event handler.

Rules were written for 6 of the high speed analog measurements. The other 6
measurements were still relevant to stress the SDS reader class and the updating of facts.
5 of the 6 rules fired once every minute. The 6th rule fired once for every single
measurement change (1,000 per second) for two full seconds sustained out of every minute.
Thus, a total of 2,005 rule firings occurred every minute, with 2,000 of them within a 2
second period. Analog measurements have considerably more processing overhead
than the discrete measurements so it was not possible to sustain thousands of rules
containing analogs to fire every second without causing CPU starvation. However, the
realistic case is considered to have only a very small percentage of the measurements
that are in the stream actually causing rules to fire. It was considered fair to have short
bursts of high rate rule firings but not long term sustained high rate rule firings.
NESTA is not intended for users to write rules to notify them via email hundreds or
thousands of times each second for a long and sustained period of time.

To summarize, NESTA sustained the above scenario for many cycles on the test-enhanced
data stream without CPU starvation and without reporting any packet losses.
The CPU utilization on the development workstation was about 90% prior to launch
and higher than that after T-0. It was heavily loaded, but NESTA maintained the pace.
NESTA performed well considering that the data stream was stuffed with between 1
and 2 times the hypothetical peak load of measurement changes for the performance
test. The long pole in the process appeared to be the number of rules that actually
fired every second sustained. However, even under launch conditions when a heavy
data change load exists, there are not expected to be many thousands of rules firing every
second. Even several hundred rules firing per minute is considered unrealistically high,
but this performance test suggests NESTA could readily handle that load.


3. Development and Deployment 

3.1. Application Use and Payoff 


At the time of writing of this chapter, the customer had used NESTA for over a year.
Hundreds of rules have been written. Along with that, hundreds of NESTA
notifications have been generated for multiple NASA engineers. These users have
received both emails and wireless pages at KSC and other remote sites. Since the 
customer is a NASA engineer responsible for oversight of contractors, the notifications 
act as an extra set of eyes that further assure the quality of government oversight. 

To better understand NESTA's payoff, the responsibilities of NASA Shuttle 
Engineers must be examined. They include: 


• Understanding their system and supporting equipment. 

• Knowing how their systems are tested and processed. 

• Being aware of when their systems are activated, tested, or in use. 

• Analyzing performance and data retrievals from any use of a system.

• Being ready to answer questions about their systems such as:



- When was it tested? 

- How did testing proceed? 

- How did the data look? 

- Is it ready to fly? 

NESTA has helped Shuttle Engineers meet these responsibilities in varying 
degrees. Below are three success stories documenting some of the benefits NESTA 
has provided. 

3.1.1. Success Story - Increased Situational Awareness 

In one usage, a Shuttle avionics system was powered up over a weekend. The NASA 
Shuttle Engineer, being responsible for that system, would not have been aware that the 
system was powered up except for receiving a NESTA notification. In this case, the 
avionics user was not part of the Shuttle Engineer's immediate organization. Thus, the 
Shuttle Engineer did not receive any communiques regarding the system's weekend 
usage. Due to NESTA, the Shuttle Engineer was better prepared to address questions 
about his system's usage were they to arise. This has not been an uncommon 
occurrence. Shuttle Engineers utilizing NESTA began realizing that some of their 
systems were being utilized much more than previously thought. Situational awareness 
increased markedly. 

3.1.2. Success Story - Increased Efficiency

Some ground operations span 24 hours and include dozens of asynchronous events that 
are broadcast on the data stream. For example, checkout of flight control hardware in 
the Orbiter Processing Facility occurred 4 to 6 times within the last year. The checkout 
included long hydraulic operations, powering up different parts of avionics, 
pressurizing/depressurizing the Orbiter, and other work. During a recent flow, the 
NESTA notifications gave exact times of events of interest to the Shuttle Engineer. 
That allowed the Shuttle Engineer to quickly identify timelines of these lengthy 
operations. Effectively, a virtual roadmap identifying significant events was 
automatically generated and that saved an hour of labor. More efficient data retrievals 
resulted. 

3.1.3. Success Story - Customer Testimonial 

Below are excerpts of an email received from a NESTA customer in April 2005. The 
testimonial details how NESTA notified a NASA engineer of a hardware inspection 
that was not previously known to be occurring. That notification provided an increased 
awareness and might have prevented a further delay in testing of Shuttle components. 


“NESTA earned its keep this weekend and I wanted to share the story with you.

The Shuttle program has a very large test called S0008 - Shuttle Integrated Test. 
After the Orbiter is mated to the ET[external tank] and SRB[solid rocket booster] stack, 
S0008 is the first big power ON testing which performs numerous tasks mostly 
concerned with the integrated Shuttle vehicle. For example, the interaction between
the Orbiter's avionics and the SRB's electro-hydraulic thrust vector control actuators. 



Due to significant technical problems with ET attach point pyros and the ET attach 
point electrical connections (the ’monoballs'), the schedule for S0008 fell completely 
apart. What started as a 42 hour test operation has now consumed the entire weekend
and will probably not be finished anytime soon. 

One of our NASA engineers came in for third shift Sunday to cover the testing. 
One important NASA function during this time period was star tracker light shade 
inspection. What happens in this test is that the star trackers are powered ON, the star 
tracker doors are opened, and then [the contractor] and NASA engineers inspect the 
inside of the star tracker - a cavity called a light shade which is a large cone coated 
with a black non-reflective coating and several baffles. The design of the light shade is 
to eliminate any and all extraneous light sources and reflections except for the star in 
view which the star tracker is trying to get a fix on. The inspection is made to make 
sure there is no foreign object debris. For example, a flake of paper could cause a 
reflection and lead to an erroneous star tracker star fix. If debris is found, special 
equipment is available to vacuum out the inside of the light shade. After this procedure 
the star tracker is powered OFF and the star tracker door is closed for the last time at 
KSC. 

Now here's where NESTA payed off. During this third shift operation yesterday, 
[the contractor] and NASA were all on center waiting on the word from the S0008 test 
conductors to perform the star tracker light shade inspection. For whatever reason, 
our NASA engineer was never notified when the checkout was to begin. [The 
contractor] began the checkout without attempting to notify NASA. The first indication 
the NASA engineer had was when NESTA sent an email to the engineer announcing 
that the star tracker was powered ON. At this point, the NASA engineer contacted the 
test conductor and directed him to keep the doors open until he could witness the 
internal cavity inspection. Without NESTA, NASA would have missed the star tracker 
inspection. And this would have led to an uncomfortable discussion about whether the 
test would have to be repeated or whether NASA could rely solely on the eyes of the 
[contractor] engineers.” 

3.2. Phased Approach to Implementation and Delivery 

Multiple releases of NESTA have been delivered to the customer. The development 
team has four members each working approximately sixty percent of his time on the 
project. The team works very closely with the customer. Generally, the team meets 
with the customer at least once per week and has multiple other correspondences via 
email and phone. 

The initial NESTA release required six months. Thereafter, a release occurred 
approximately every month. Prior to adopting Java and Jess, some preliminary 
performance testing was completed to verify that the Java language and Jess rule 
engine were fast enough to handle the Shuttle data stream rates. Concurrently with that 
coarse performance testing, the initial set of requirements was being developed. 

The software process model employed is a combination of extreme programming 
and the iterative waterfall model. The team and customer understood the need to 
anticipate and accommodate changes in the requirements. The customer, like much of 
the development team, had little experience with rule-based systems, so there was a 
learning curve in how best to represent knowledge and interface the data stream with 
Jess. After about six months, a baseline set of requirements existed, but the requirement 
space is still fluid and undergoes change over time. These changes are seen as a 
learning process through which we explore the possibilities of the system. As releases 
are delivered to the customer, new requirements are elicited and old ones may become 
defunct. 

3.3. Development Tools 

In addition to Java and Jess, other tools used include: 

• Eclipse as an integrated development environment. 

• Visio 2000 to develop Unified Modeling Language models. 

• CVS for configuration management. 

• Ant for automating builds. 

• JUnit for automated Java unit testing. 

• Emma for Java code coverage including measurements and reporting. 

• Optimizeit by Borland for profiling performance and detecting and isolating 
problems. 

3.4. Technical Difficulties 

3.4.1. Data Validity 

As indicated earlier in the chapter, the data stream is based on User Datagram Protocol 
(UDP). As such, the connection is not always reliable and packets may get dropped. 
This poses problems when rules are waiting for data to arrive. Data health and validity 
become questionable. If the data stream connection is lost entirely or data becomes 
stale (i.e. not updated), false positives or false negatives may result. That is, 
notifications of hardware events may never be sent or be sent in error. 

To partially address this data validity issue, additional measurements are included 
in the rules to check the validity of the stream. Measurements are now marked 
invalid when one or more packets are dropped or when the source of the measurement 
becomes bad. A larger problem of false negatives remains: if the data stream drops 
packets while a monitored event occurs, the corresponding email is never sent. Aside 
from notifying the Shuttle engineer of a data loss when it happens, we have not yet 
identified a mechanism that guarantees delivery of all notifications, since the data 
stream is unreliable. 
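
As a rough illustration of the validity marking described above, the following Java sketch flags measurements as unusable after a detected packet gap or once they have gone stale. The class name `ValidityTracker`, the per-packet sequence number, and the staleness threshold are assumptions made for this example; the chapter does not describe how NESTA detects dropped packets internally.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical tracker for illustration only; not NESTA's actual implementation. */
final class ValidityTracker {
    private final long maxAgeMillis;   // assumed staleness threshold
    private long lastSequence = -1;    // assumed per-packet sequence number
    private final Map<String, Long> lastUpdate = new HashMap<>();
    private final Map<String, Boolean> valid = new HashMap<>();

    ValidityTracker(long maxAgeMillis) {
        this.maxAgeMillis = maxAgeMillis;
    }

    /** Records one packet carrying one measurement; returns how many packets were skipped before it. */
    long onPacket(long sequence, String measurement, long nowMillis) {
        long missed = (lastSequence < 0) ? 0 : Math.max(0, sequence - lastSequence - 1);
        lastSequence = sequence;
        if (missed > 0) {
            // A gap was seen: every known measurement is suspect until it is refreshed.
            valid.replaceAll((name, flag) -> Boolean.FALSE);
        }
        lastUpdate.put(measurement, nowMillis);
        valid.put(measurement, Boolean.TRUE);
        return missed;
    }

    /** A measurement is usable only if it was refreshed since the last gap and is not stale. */
    boolean isUsable(String measurement, long nowMillis) {
        Long updatedAt = lastUpdate.get(measurement);
        if (updatedAt == null || !valid.get(measurement)) {
            return false;
        }
        return nowMillis - updatedAt <= maxAgeMillis;
    }

    public static void main(String[] args) {
        ValidityTracker tracker = new ValidityTracker(1000);
        tracker.onPacket(1, "HYD_PRESS", 0);
        long missed = tracker.onPacket(4, "CABIN_PRESS", 100);
        System.out.println("missed=" + missed
                + " hydUsable=" + tracker.isUsable("HYD_PRESS", 100));
    }
}
```

A rule could test such a flag before firing, which reduces false positives from stale values. It does not recover events that occurred inside the gap, so the false negative problem noted above remains.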

3.4.2. Measurement Database Changes 

Multiple data streams and control rooms exist. Often, the measurement database, 
which is used to decode the SDS, dynamically changes on the stream as a result of 
operations. When that happens, decoding measurements becomes impossible and facts 
can no longer be updated in Jess' working memory. A short term fix to this problem 
was to simply notify the NESTA system administrator when the stream changes. A 
measurement database Java bean was added and is used within a user rule as a fact. 
When the measurement database changes, the administrator automatically gets an email 
and may restart NESTA accordingly. Longer term, automatic restarts of the agent will 
be provided. 
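
The short term fix can be pictured with a standard JavaBeans property change event. The sketch below is a minimal illustration, assuming a hypothetical bean named `MeasurementDatabaseBean` with a `databaseId` property; the real bean's name, properties, and its wiring into Jess are not given in this chapter, and the list of alert strings merely stands in for the administrator email.

```java
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical bean; the property name "databaseId" is an assumption for this example. */
final class MeasurementDatabaseBean {
    private final PropertyChangeSupport changes = new PropertyChangeSupport(this);
    private String databaseId;

    MeasurementDatabaseBean(String databaseId) {
        this.databaseId = databaseId;
    }

    public String getDatabaseId() {
        return databaseId;
    }

    public void setDatabaseId(String newId) {
        String oldId = databaseId;
        databaseId = newId;
        // PropertyChangeSupport fires no event when old and new values are equal and non-null.
        changes.firePropertyChange("databaseId", oldId, newId);
    }

    public void addPropertyChangeListener(PropertyChangeListener listener) {
        changes.addPropertyChangeListener(listener);
    }

    public void removePropertyChangeListener(PropertyChangeListener listener) {
        changes.removePropertyChangeListener(listener);
    }

    public static void main(String[] args) {
        MeasurementDatabaseBean bean = new MeasurementDatabaseBean("DB-A");
        List<String> adminAlerts = new ArrayList<>(); // stand-in for the administrator email
        bean.addPropertyChangeListener(event -> adminAlerts.add(
                "Measurement database changed: "
                        + event.getOldValue() + " -> " + event.getNewValue()));
        bean.setDatabaseId("DB-B");
        adminAlerts.forEach(System.out::println);
    }
}
```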

3.4.3. Flood of Emails 

If an end user writes a rule incorrectly, there is a possibility of flooding the network and 
servers with hundreds or even thousands of notifications. To prevent that, multiple 




Figure 6. Web Application Maintenance Interface Summary Page 

During launch countdown, NASA Shuttle engineers are required to monitor shuttle 
telemetry data for violations of launch commit criteria (LCC) and to verify that the 
contractors troubleshoot problems correctly. When a violation is recognized by the 
system engineers it is reported to the NASA Test Director. The problem report, or call, 
includes a description of the problem, the criticality, whether a hold is requested, and 
whether a preplanned troubleshooting procedure exists. 

The Shuttle is composed of many subsystems (e.g. Main Propulsion, Hydraulics). 
Each of those subsystems has a team of engineers responsible for troubleshooting 
problems for that respective system during a launch countdown. Many systems have a 
large number of measurements with associated LCC limits and a large number of LCC 
requirements. 

Shuttle Engineers must monitor for many types of limit violations ranging from 
simple high and low limit boundaries to much more complex first order logic 
expressions. Each team has its own tools for identifying LCC violations. Many of 
these tools use the LPS software and simply change the color of the displayed data 



safeguards, such as user defined limits, were provided to filter emails after a given 
number have been generated for a particular email account. 

Beyond that possibility of user error, there was a separate need to queue emails 
that may be related to some sequence. Queuing provides a mechanism where multiple 
messages expected to occur within a short time period are grouped together before 
being emailed in bulk. For example, four flight control avionics boxes are often 
powered up in a short time period. Rather than a user receiving four separate flight 
control emails that may be interrelated, it was necessary to provide a queuing 
mechanism that allows a user to tie related emails to the same queue and receive one 
bulk email that was a compilation of what would otherwise be multiple emails. Both 
the queue time and queue length are configurable by the end user. 
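
The two safeguards described above, a per-account cap and a queue that batches related messages, can be sketched as follows. This is an illustrative sketch under stated assumptions: the class names `EmailLimiter` and `NotificationQueue`, the method names, and the use of explicit timestamps are all hypothetical, and actual mail delivery is left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical per-account cap: once the limit is reached, further emails are filtered. */
final class EmailLimiter {
    private final int limitPerAccount;
    private final Map<String, Integer> sentCounts = new HashMap<>();

    EmailLimiter(int limitPerAccount) {
        this.limitPerAccount = limitPerAccount;
    }

    /** Counts the attempt and reports whether this email may still be sent. */
    boolean allow(String account) {
        int count = sentCounts.merge(account, 1, Integer::sum);
        return count <= limitPerAccount;
    }
}

/** Hypothetical queue that groups related messages into one bulk email body. */
final class NotificationQueue {
    private final int maxLength;       // user-configurable queue length
    private final long maxWaitMillis;  // user-configurable queue time
    private final List<String> pending = new ArrayList<>();
    private long firstQueuedAt;

    NotificationQueue(int maxLength, long maxWaitMillis) {
        this.maxLength = maxLength;
        this.maxWaitMillis = maxWaitMillis;
    }

    /** Queues a message; returns the bulk body if the queue has just filled up. */
    Optional<String> offer(String message, long nowMillis) {
        if (pending.isEmpty()) {
            firstQueuedAt = nowMillis;
        }
        pending.add(message);
        if (pending.size() >= maxLength) {
            return Optional.of(flush());
        }
        return Optional.empty();
    }

    /** Called periodically; flushes once the oldest queued message has waited long enough. */
    Optional<String> poll(long nowMillis) {
        if (!pending.isEmpty() && nowMillis - firstQueuedAt >= maxWaitMillis) {
            return Optional.of(flush());
        }
        return Optional.empty();
    }

    private String flush() {
        String body = String.join("\n", pending);
        pending.clear();
        return body;
    }

    public static void main(String[] args) {
        NotificationQueue queue = new NotificationQueue(4, 60_000L);
        for (int box = 1; box <= 4; box++) {
            queue.offer("Flight control box " + box + " powered up", box * 1000L)
                    .ifPresent(System.out::println);
        }
    }
}
```

With a queue length of four, the four flight control power-up messages in the example from the text would arrive as one bulk email; the time limit ensures that a partially filled queue is still delivered.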

3.5. Maintenance 

New releases are delivered approximately every month by the development team. 
Those releases may include bug fixes for problems reported in the former release. 
However, new releases are generally driven by new functionality as opposed to being 
driven by software errors. 

The design of the NESTA application facilitates updates by the end user. The 
application uses a data driven approach for the user files. All of the rules and facts are 
stored in Jess scripts. When rules have to be created or modified, the user has access to 
several text based files. A facts file allows a user to add measurements that should be 
monitored. A rules file allows the entry of new rules. Since these are text-based script 
files, no compilation is required by the end user. The files are parsed at application 
startup. This data driven approach is powerful in that it enables the end users to 
maintain their own files and not be at the mercy of the development team to add new 
support for new facts and rules. 

3.5.1. Web Application Maintenance Interface 

A Web Application Maintenance Interface (WAMI) was developed to aid the users in 
managing and monitoring the agent. WAMI is based upon JMX[16] and MX4J[17]. 
Figures 6 and 7 show the Summary and Management Bean Views, respectively. The 
Summary View shows the current state of the agent, presenting information such as 
agent starting time, the data stream being monitored, the number of dropped packets, 
memory usage, and more. The Management Bean page shows a snapshot of the values 
of a particular set of measurements from the data stream and also allows the customer 
to query the value of any arbitrary measurement on the data stream. Further 
information is provided in other pages and views. 


4. Launch Commit Criteria Monitoring Agent 

Another agent using Jess has also been developed at NASA KSC. The Launch Commit 
Criteria Monitoring Agent (LCCMA)[18] identifies limit warnings and violations of 
launch commit criteria. Whereas NESTA was developed for day-to-day operations, 
LCCMA is targeted at launch countdown activities. 




Figure 7. Web Application Maintenance Interface Management Bean View Page 

and/or present a text message to the user or set off an audible alarm. Troubleshooting 
may require other displays such as plots and troubleshooting flowcharts. Valuable time 
is spent locating these procedures and locating the data that supports them. 

With LCCMA, when a launch commit criteria violation is detected, the Shuttle 
engineer is notified via a Status Board Display on a workstation. Troubleshooting 
procedures are automatically made available on the Display. This spares the Shuttle 
engineer from manually searching for the correct procedure mapped to the given 
violation. 

4.1. Graphical User Interface 

A graphical user interface currently exists for the Status Board Display. It is being 
upgraded and Figure 8 shows a storyboard representative of that future interface. The 


5. Conclusion and Future Work 


NESTA has increased situational awareness of ground processing at NASA KSC. 
More and more Shuttle engineers are relying on NESTA each month and are creating 
additional rules for monitoring the data stream. The infusion of AI technologies, 
particularly the Jess rule-based library, has proved very fruitful. Interfacing and 
integrating these modern AI tools within a legacy launch system demonstrates the 
scalability and applicability of the tools and paradigm. 

The knowledge patterns that are evolving within NESTA will make it easier to 
train new users and also allow faster creation of rules. Many other enhancements are 
planned such as providing an advanced graphical user interface for creating the rules. 

5.1. Future Exploration Agents 

As indicated in the national Vision for Space Exploration[19], an increased human and 
robotic presence will be cultivated in space, on lunar and Martian surfaces, and other 
destinations. Spaceports will now span from the Earth to the Moon and beyond. A 
new set of challenges is presented by this Exploration Vision. In particular, the need 
for autonomy significantly increases as people and payloads are sent greater distances 
from Earth. 

Agents for these future applications will demand much higher degrees of 
autonomy than today’s Shuttle agents. Few or no human experts will reside at remote 
lunar or Martian sites to correct problems in a timely manner. More automation will be 
required along with advanced diagnostics and prognostics. This requires higher levels 
of reasoning. 

System and hardware engineers, along with technicians, leverage 
multiple skills when monitoring, diagnosing, and prognosticating problems in Shuttle 
ground support equipment. For the Exploration Vision, the need for extending these 
skills to support other vehicles and payloads at remote locations from the Earth to Mars 
becomes essential. These skills include being rational, collaborative, goal driven and 
[...]. The agents discussed earlier in this chapter, NESTA and LCCMA, are capable of 
shallow reasoning of short inference chains within the Shuttle domain. However, these 
existing agents can be endowed with higher levels of rationality enabling a deeper 
reasoning. We are investigating how to [...] Spaceport Exploration Agents (SEAs) in 
support of the Exploration Vision. 

SEAs need to communicate and collaborate along multiple and lengthy 
logistics chains. This does not simply include agents monitoring pre-flight checkout of 
vehicles at a terrestrial spaceport (e.g. NESTA monitoring Shuttle [...]). 
SEAs will reside in multiple locations at great distances. Logistics, scheduling, and 
planning are just some of the activities that these agents will manage. 

Within this virtual collaborative management chain, SEAs will be inundated with 
large amounts of data that must be sorted and processed. It becomes necessary for 
them to revise their sets of beliefs as new data arrives. It is simply not enough to revise 
singular data points within an agent's working memory and to have an agent blindly 
react to those changes. Rather, an agent must possess the ability to revise previously 
concluded assertions based on what may be now stale data. This activity is called truth 
maintenance[20][21][22], also known as belief revision, and is particularly important 
when deep reasoning of long inferences is necessary. An assumption based truth 
Figure 8. LCCMA Status Board Display 

Status Board Display shows the health of the network connection, data stream status, 
countdown time, and other relevant information. 

When LCC limits are violated, the LCC call is displayed in the text box. The user 
reads the text and, if there is an associated troubleshooting file, clicks the file button 
next to the text. This brings up a Troubleshooting Display for that particular LCC and 
limit. The LCC text remains bold until the Acknowledge button is pressed. Message 
text can be displayed with one of three icons representing a violation, warning, or 
informational cue. Measurements associated with the LCC may also be plotted. 

The text messages can be read over the Operational Intercommunication System as 
LCC calls during the countdown. Calls will change based on what limit is violated (e.g. 
warning, LCC, high/low limit), the time criticality of the call, and LCC effectivity. The 
agent aids the NASA engineer in making a Go/No-Go decision for launch. 
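
The simplest class of check mentioned for LCC monitoring, a high and low limit boundary that yields a violation, a warning, or an informational cue, might look like the following Java sketch. The class name `LimitCheck`, the nesting of warning limits inside LCC limits, and the mapping of in-limit readings to an informational cue are assumptions for illustration; LCCMA itself expresses its criteria as Jess rules, which are not reproduced here.

```java
/** Hypothetical high/low limit check; not LCCMA's actual rule representation. */
final class LimitCheck {
    enum Severity { INFORMATIONAL, WARNING, VIOLATION }

    private final double lccLow;
    private final double lccHigh;
    private final double warnLow;
    private final double warnHigh;

    /** Assumes the warning band lies inside the LCC band: lccLow <= warnLow <= warnHigh <= lccHigh. */
    LimitCheck(double lccLow, double lccHigh, double warnLow, double warnHigh) {
        if (!(lccLow <= warnLow && warnLow <= warnHigh && warnHigh <= lccHigh)) {
            throw new IllegalArgumentException("limits must be nested and ordered");
        }
        this.lccLow = lccLow;
        this.lccHigh = lccHigh;
        this.warnLow = warnLow;
        this.warnHigh = warnHigh;
    }

    /** Classifies one reading against the LCC limits first, then the warning limits. */
    Severity classify(double value) {
        if (value < lccLow || value > lccHigh) {
            return Severity.VIOLATION;
        }
        if (value < warnLow || value > warnHigh) {
            return Severity.WARNING;
        }
        return Severity.INFORMATIONAL;
    }

    public static void main(String[] args) {
        // Illustrative numbers only; they are not real launch commit criteria.
        LimitCheck check = new LimitCheck(0.0, 100.0, 10.0, 90.0);
        for (double reading : new double[] {50.0, 95.0, 101.0}) {
            System.out.println(reading + " -> " + check.classify(reading));
        }
    }
}
```

The more complex first order logic expressions mentioned earlier would combine several such conditions across measurements, which is where a rule engine becomes more convenient than hand-written checks.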


maintenance system (ATMS) can reason over many contexts simultaneously. By 
capturing, maintaining, and deploying spaceport expertise within ATMS-enabled SEAs, 
the costs and manpower required to meet the Exploration Vision are reduced while 
safety, reliability, and availability are increased. 